Statistical Mechanics of High-Dimensional Inference
نویسندگان
چکیده
To model modern large-scale datasets, we need efficient algorithms to infer a set of P unknown model parameters from N noisy measurements. What are fundamental limits on the accuracy of parameter inference, given finite signal-to-noise ratios, limited measurements, prior information, and computational tractability requirements? How can we combine prior information with measurements to achieve these limits? Classical statistics gives incisive answers to these questions as the measurement density α = N P →∞. However, these classical results are not relevant to modern high-dimensional inference problems, which instead occur at finite α. We formulate and analyze high-dimensional inference as a problem in the statistical physics of quenched disorder. Our analysis uncovers fundamental limits on the accuracy of inference in high dimensions, and reveals that widely cherished inference algorithms like maximum likelihood (ML) and maximum-a posteriori (MAP) inference cannot achieve these limits. We further find optimal, computationally tractable algorithms that can achieve these limits. Intriguingly, in high dimensions, these optimal algorithms become computationally simpler than MAP and ML, while still outperforming them. For example, such optimal algorithms can lead to as much as a 20% reduction in the amount of data to achieve the same performance relative to MAP. Moreover, our analysis reveals simple relations between optimal high dimensional inference and low dimensional scalar Bayesian inference, insights into the nature of generalization and predictive power in high dimensions, information theoretic limits on compressed sensing, phase transitions in quadratic inference, and connections to central mathematical objects in convex optimization theory and random matrix theory.
منابع مشابه
Statistical Mechanics of High Dimensional Inference Supplementary Material
2 Deriving inference error from the statistical physics of disordered systems 2 2.1 Replica equations at finite temperature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 2.2 Replica equations in the low temperature limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.3 Moreau envelope formulation of the replica equations . . . . . ....
متن کاملLarry Wasserman Low Assumptions , High Dimensions
These days, statisticians often deal with complex, high dimensional datasets. Researchers in statistics and machine learning have responded by creating many new methods for analyzing high dimensional data. However, many of these new methods depend on strong assumptions. The challenge of bringing low assumption inference to high dimensional settings requires new ways to think about the foundatio...
متن کاملHigh-dimensional inference with the generalized Hopfield model: principal component analysis and corrections.
We consider the problem of inferring the interactions between a set of N binary variables from the knowledge of their frequencies and pairwise correlations. The inference framework is based on the Hopfield model, a special case of the Ising model where the interaction matrix is defined through a set of patterns in the variable space, and is of rank much smaller than N. We show that maximum like...
متن کاملStatistical Inference for High Dimensional Data
STATISTICAL INFERENCE FOR HIGH DIMENSIONAL DATA
متن کاملDepartment of Statistics -statistics Seminar – Spring 2013 Title: "some Challenges and Results for Causal and Statistical Inference with Social Network Data"
Reproducibility is imperative for any scientific discovery. More often than not, modern scientific findings rely on statistical analysis of high-dimensional data. At a minimum, reproducibility manifests itself in stability of statistical results relative to “reasonable” perturbations to data and to the model used. Jacknife, bootstrap, and cross-validation are based on perturbations to data, whi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016